Generalized Abbreviation Prediction with Negative Full Forms and Its Application on Improving Chinese Web Search

نویسندگان

  • Xu Sun
  • Wenjie Li
  • Fanqi Meng
  • Houfeng Wang
چکیده

In Chinese abbreviation prediction, prior studies are limited on positive full forms. This lab assumption is problematic in realworld applications, which have a large portion of negative full forms (NFFs). We propose solutions to solve this problem of generalized abbreviation prediction. Experiments show that the proposed unified method outperforms baselines, with the full-match accuracy of 79.4%. Moreover, we apply generalized abbreviation prediction for improving web search quality. Experimental results on web search demonstrate that our method can significantly improve the search results, with the search F-score increasing from 35.9% to 64.9%. To our knowledge, this is the first study on generalized abbreviation prediction and its application on web search.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Chinese Dataset with Negative Full Forms for General Abbreviation Prediction

Abbreviation is a common phenomenon across languages, especially in Chinese. In most cases, if an expression can be abbreviated, its abbreviation is used more often than its fully expanded forms, since people tend to convey information in a most concise way. For various language processing tasks, abbreviation is an obstacle to improving the performance, as the textual form of an abbreviation do...

متن کامل

Constructing Chinese Abbreviation Dictionary: A Stacked Approach

Abbreviation is a common linguistic phenomenon with wide popularity and high rate of growth. Correctly linking full forms to their abbreviations will be helpful in many applications. For example, it can improve the recall of information retrieval systems. An intuition to solve this is to build an abbreviation dictionary in advance. This paper investigates an automatic abbreviation generation me...

متن کامل

Automatic Chinese Abbreviation Generation Using Conditional Random Field

Boulder, Colorado, June 2009. c ©2009 Association for Computational Linguistics Automatic Chinese Abbreviation Generation Using Conditional Random Field Dong Yang, Yi-cheng Pan, and Sadaoki Furui Department of Computer Science Tokyo Institute of Technology Tokyo 152-8552 Japan {raymond,thomas,furui}@furui.cs.titech.ac.jp Abstract This paper presents a new method for automatically generating abb...

متن کامل

Coarse-grained Candidate Generation and Fine-grained Re-ranking for Chinese Abbreviation Prediction

Correctly predicting abbreviations given the full forms is important in many natural language processing systems. In this paper we propose a two-stage method to find the corresponding abbreviation given its full form. We first use the contextual information given a large corpus to get abbreviation candidates for each full form and get a coarse-grained ranking through graph random walk. This coa...

متن کامل

Cluster based Chinese abbreviation modeling

Abbreviations in Chinese are widely observed in Chinese spoken language. Automatic generation of Chinese abbreviations helps to improve Chinese natural language understanding systems and Chinese search engine. The abbreviation generation is treated as a character-based tagging problem. Due to limited training data, Chinese abbreviation generation suffers from data sparseness. Two types of strat...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013